16 research outputs found

    Retrospective Higher-Order Markov Processes for User Trails

    Users form information trails as they browse the web, check in with a geolocation, rate items, or consume media. A common problem is to predict what a user might do next for the purposes of guidance, recommendation, or prefetching. First-order and higher-order Markov chains have been widely used to study such sequences of data. First-order Markov chains are easy to estimate, but lack accuracy when history matters. Higher-order Markov chains, in contrast, have too many parameters and suffer from overfitting the training data. Fitting these parameters with regularization and smoothing offers only mild improvements. In this paper we propose the retrospective higher-order Markov process (RHOMP) as a low-parameter model for such sequences. This model is a special case of a higher-order Markov chain where the transitions depend retrospectively on a single history state instead of an arbitrary combination of history states. There are two immediate computational advantages: the number of parameters is linear in the order of the Markov chain, and the model can be fit to large state spaces. Furthermore, by providing a specific structure to the higher-order chain, RHOMPs improve model accuracy by efficiently utilizing history states without the risk of overfitting the data. We demonstrate how to estimate a RHOMP from data, and we demonstrate the effectiveness of our method on various real application datasets spanning geolocation data, review sequences, and business locations. The RHOMP model uniformly outperforms higher-order Markov chains, Kneser-Ney regularization, and tensor factorizations in terms of prediction accuracy.
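The first-order baseline mentioned in this abstract can be illustrated with a minimal sketch. It is not the RHOMP estimator from the paper; it only shows maximum-likelihood counting for a first-order Markov chain over user trails. The function names `fit_first_order` and `predict_next` and the toy trails are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def fit_first_order(trails):
    """Estimate P(next | current) by counting transitions in the trails."""
    counts = defaultdict(Counter)
    for trail in trails:
        # Pair each state with its successor in the same trail.
        for current, nxt in zip(trail, trail[1:]):
            counts[current][nxt] += 1
    model = {}
    for state, nexts in counts.items():
        total = sum(nexts.values())
        model[state] = {s: c / total for s, c in nexts.items()}
    return model


def predict_next(model, state):
    """Return the most probable next state, or None for an unseen state."""
    nexts = model.get(state)
    if not nexts:
        return None
    return max(nexts, key=nexts.get)


# Hypothetical browsing trails, used only to exercise the sketch.
trails = [
    ["home", "search", "item"],
    ["home", "search", "cart"],
    ["home", "search", "item"],
]
model = fit_first_order(trails)
```

With these toy trails, `predict_next(model, "search")` returns `"item"`, because two of the three observed transitions out of `"search"` lead there. A chain of order m over n states would need on the order of n^m rows of such counts, which is the parameter growth the abstract refers to.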

    Music recommendation and discovery in the long tail

    Music consumption is biased towards a few popular artists. For instance, in 2007 only 1% of all digital tracks accounted for 80% of all sales. Similarly, 1,000 albums accounted for 50% of all album sales, and 80% of all albums sold were purchased fewer than 100 times. There is a need to help people filter, discover, personalise and recommend from the huge amount of music content available along the Long Tail. Current music recommendation algorithms try to accurately predict what people want to listen to. However, quite often these algorithms tend to recommend popular music, or music already well known to the user, which decreases the effectiveness of the recommendations. These approaches focus on improving the accuracy of the recommendations; that is, they try to make accurate predictions about what a user could listen to or buy next, regardless of how useful the recommendations are to the user. In this thesis we stress the importance of the user's perceived quality of the recommendations. We model the Long Tail curve of artist popularity to predict potentially interesting and unknown music hidden in the tail of the popularity curve. Effective recommendation systems should promote novel and relevant material (non-obvious recommendations), taken primarily from the tail of a popularity distribution. The main contributions of this thesis are: (i) a novel network-based approach for recommender systems, based on the analysis of the item (or user) similarity graph and the popularity of the items, (ii) a user-centric evaluation that measures the relevance and novelty of the recommendations for the user, and (iii) two prototype systems that implement the ideas derived from the theoretical work. Our findings have significant implications for recommender systems that assist users in exploring the Long Tail, digging for content they might like.

    Bridging the Music Semantic Gap

    In this paper we present the music information plane and the different levels of information extraction that exist in the musical domain. Based on this approach we propose a way to overcome the existing semantic gap in the music field. Our approach is twofold: we propose a set of music descriptors that can automatically be extracted from the audio signals, and a top-down approach that adds explicit and formal semantics to these annotations. These music descriptors are generated in two ways: as derivations and combinations of lower-level descriptors, and as generalizations induced from manually annotated databases by the intensive application of machine learning. We believe that merging both approaches (bottom-up and top-down) can overcome the existing semantic gap in the musical domain. The reported research has been funded by the EU-FP6-IST-507142 project SIMAC (Semantic Interaction with Music Audio Contents).

    Inferring semantic facets of a music folksonomy with wikipedia

    Music folksonomies include both general and detailed descriptions of music, and are usually continuously updated. These are significant advantages over music taxonomies, which tend to be incomplete and inconsistent. However, music folksonomies have inherently loose and open semantics, which hampers their use in many applications, such as structured music browsing and recommendation. In this paper, we present a system that can (1) automatically obtain a set of semantic facets underlying the folksonomy of the social music website Last.fm, and (2) categorize Last.fm tags with respect to the obtained facets. The semantic facets are anchored upon the structure of Wikipedia, a dynamic repository of universal knowledge. Fabien Gouyon is supported by the Media Arts and Technologies project (MAT), NORTE-07-0124-FEDER-000061, co-financed by the North Portugal Regional Operational Programme (ON.2 O Novo Norte), under the National Strategic Reference Framework (NSRF), through the European Regional Development Fund (ERDF), and by national funds, through the Portuguese funding agency, Fundação para a Ciência e a Tecnologia (FCT).

    Extending the folksonomies of freesound.org using content-based audio analysis

    Paper presented at the 6th Sound and Music Computing Conference, held 23 to 25 July 2009 in Porto, Portugal. This paper presents an in-depth study of the social tagging mechanisms used in Freesound.org, an online community where users share and browse audio files by means of tags and content-based audio similarity search. We performed two analyses of the sound collection. The first concerns how users tag the sounds; it revealed some well-known problems that occur in collaborative tagging systems (i.e. polysemy, synonymy, and the scarcity of the existing annotations). Moreover, we show that more than 10% of the collection was scarcely annotated, with only one or two tags per sound, thus frustrating the retrieval task. The second analysis therefore focuses on enhancing the semantic annotations of these sounds by means of content-based audio similarity (autotagging). To "autotag" the sounds, we use a k-NN classifier that selects the available tags from the most similar sounds. Human assessment is performed to evaluate the perceived quality of the candidate tags. The results show that, in 77% of the sounds used, the annotations were correctly extended with the proposed tags derived from audio similarity.
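The k-NN tag propagation idea described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' system: the feature vectors, the Euclidean distance, the voting threshold `min_votes`, and the function name `knn_autotag` are all hypothetical choices, since the abstract does not specify them.

```python
import math
from collections import Counter


def knn_autotag(query, collection, k=3, min_votes=2):
    """Propose tags for `query` from its k nearest annotated sounds.

    collection: list of (feature_vector, tags) pairs.
    A tag is proposed when at least `min_votes` neighbours carry it.
    """
    # Rank annotated sounds by Euclidean distance to the query features.
    neighbours = sorted(collection, key=lambda item: math.dist(query, item[0]))[:k]
    # Each neighbour votes once for each of its distinct tags.
    votes = Counter(tag for _, tags in neighbours for tag in set(tags))
    return sorted(tag for tag, n in votes.items() if n >= min_votes)


# Hypothetical two-dimensional audio features with existing tags.
collection = [
    ([0.0, 0.0], ["rain", "water"]),
    ([0.1, 0.0], ["rain", "storm"]),
    ([0.0, 0.2], ["water", "rain"]),
    ([5.0, 5.0], ["drum"]),
]
proposed = knn_autotag([0.05, 0.05], collection)
```

Here `proposed` is `["rain", "water"]`: the three nearest sounds all carry "rain", two carry "water", and "storm" falls below the voting threshold. Requiring agreement among neighbours is one simple way to limit noisy candidates before human assessment.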

    Mucosa: a music content semantic annotator

    Paper presented at ISMIR 2005, the 6th International Conference on Music Information Retrieval, held 11 to 15 September 2005 in London, United Kingdom. MUCOSA (Music Content Semantic Annotator) is an environment for the annotation and generation of music metadata at different levels of abstraction. It is composed of three tiers: an annotation client that deals with micro-annotations (i.e. within-file annotations), a collection tagger that deals with macro-annotations (i.e. across-files annotations), and a collaborative annotation subsystem that manages large-scale annotation tasks that can be shared among different research centres. The annotation client is an enhanced version of WaveSurfer, a speech annotation tool. The collection tagger includes tools for automatic generation of unary descriptors, invention of new descriptors, and propagation of descriptors across sub-collections or playlists. Finally, the collaborative annotation subsystem, based on Plone, makes it possible to share the annotation chores and results between several research institutions. A collection of annotated songs is available as a "starter pack" to all individuals or institutions that are eager to join this initiative. The research and development reported here was partially funded by the EU-FP6-IST-507142 project SIMAC (Semantic Interaction with Music Audio Contents). The authors would like to thank Edgar Barroso, and the Audioclas and CLAM teams, for their support to the project.

    Singing voice synthesis combining excitation plus resonance and sinusoidal plus residual models

    This paper presents an approach to modeling the singing voice, with a particular emphasis on the naturalness of the resulting synthetic voice. The underlying analysis/synthesis technique is based on Spectral Modeling Synthesis (SMS) and a newly developed Excitation plus Resonance (EpR) model. With this approach, a complete singing voice synthesizer is developed that generates a vocal melody from the score and the phonetic transcription of a song.